Bayes-Adaptive Simulation-based Search with Value Function Approximation
Abstract
Bayes-adaptive planning offers a principled solution to the exploration-exploitation trade-off under model uncertainty. It finds the optimal policy in belief space, which explicitly accounts for the expected effect on future rewards of reductions in uncertainty. However, the Bayes-adaptive solution is typically intractable in domains with large or continuous state spaces. We present a tractable method for approximating the Bayes-adaptive solution by combining simulation-based search with a novel value function approximation technique that generalises appropriately over belief space. Our method outperforms prior approaches in both discrete bandit tasks and simple continuous navigation and control tasks.
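To make the idea of planning in belief space concrete, the following is a minimal sketch for a Bernoulli bandit, one of the discrete settings the abstract mentions. It is not the method of the paper: it uses plain Monte Carlo rollouts with no value function approximation, and the names `update`, `posterior_mean`, `rollout_value` and `estimate_q`, the Beta prior, and the greedy rollout policy are all assumptions chosen for illustration. Each simulation samples a bandit from the current belief, pulls a chosen first arm, and then acts greedily while updating the belief from simulated rewards, so the estimate reflects how future information changes later choices.

```python
import random


def update(belief, arm, reward):
    """Return the Beta posterior after observing a 0/1 reward on `arm`.

    `belief` is a tuple of (alpha, beta) pairs, one per arm (an assumed
    representation, not taken from the paper).
    """
    a, b = belief[arm]
    new = list(belief)
    new[arm] = (a + reward, b + 1 - reward)
    return tuple(new)


def posterior_mean(belief):
    """Posterior mean success probability of each arm."""
    return [a / (a + b) for a, b in belief]


def rollout_value(belief, first_arm, horizon, rng):
    """One simulated return starting from `belief`.

    Samples arm probabilities from the belief, pulls `first_arm`, then
    acts greedily on the posterior mean, updating the belief in simulation.
    """
    probs = [rng.betavariate(a, b) for a, b in belief]
    total = 0
    arm = first_arm
    for _ in range(horizon):
        reward = 1 if rng.random() < probs[arm] else 0
        total += reward
        belief = update(belief, arm, reward)
        means = posterior_mean(belief)
        arm = max(range(len(means)), key=means.__getitem__)
    return total


def estimate_q(belief, horizon, n_sims=2000, seed=0):
    """Monte Carlo estimate of the return for each possible first arm."""
    rng = random.Random(seed)
    return [
        sum(rollout_value(belief, arm, horizon, rng) for _ in range(n_sims)) / n_sims
        for arm in range(len(belief))
    ]


if __name__ == "__main__":
    prior = ((1, 1), (1, 1))  # uniform Beta(1, 1) prior on two arms
    print(estimate_q(prior, horizon=10))
```

Because every rollout enumerates nothing and stores no table, the cost grows with the number of simulations rather than with the size of the belief space; generalising across similar beliefs, which the abstract attributes to value function approximation, is deliberately left out of this sketch.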
Similar resources
Function Approximation Approach for Robust Adaptive Control of Flexible joint Robots
This paper is concerned with the problem of designing a robust adaptive controller for flexible joint robots (FJR). Under the assumption of weak joint elasticity, FJR is firstly modeled and converted into singular perturbation form. The control law consists of a FAT-based adaptive control strategy and a simple correction term. The first term of the controller is used to stability of the slow dy...
Robust adaptive control of voltage saturated flexible joint robots with experimental evaluations
This paper is concerned with the problem of designing and implementing a robust adaptive control strategy for flexible joint electrically driven robots (FJEDR), while considering the constraints on the actuator voltage input. The control design procedure is based on function approximation technique, to avoid saturation besides being robust against both structured and unstructured uncertaintie...
Sample-based Search Methods for Bayes-adaptive Planning
A fundamental issue for control is acting in the face of uncertainty about the environment. Amongst other things, this induces a trade-off between exploration and exploitation. A model-based Bayesian agent optimizes its return by maintaining a posterior distribution over possible environments, and considering all possible future paths. This optimization is equivalent to solving a Markov Decisio...
Controlling Nonlinear Processes, using Laguerre Functions Based Adaptive Model Predictive Control (AMPC) Algorithm
Laguerre functions have many advantages, such as good approximation capability for different systems, low computational complexity and ease of on-line parameter identification. Therefore, they are widely adopted for complex industrial process control. In this work, a Laguerre function based adaptive model predictive control (AMPC) algorithm was implemented to control continuous stirred tank rea...
Approximating Bayes Estimates by Means of the Tierney Kadane, Importance Sampling and Metropolis-Hastings within Gibbs Methods in the Poisson-Exponential Distribution: A Comparative Study
Here, we work on the problem of point estimation of the parameters of the Poisson-exponential distribution through the Bayesian and maximum likelihood methods based on complete samples. The point Bayes estimates under the symmetric squared error loss (SEL) function are approximated using three methods, namely the Tierney Kadane approximation method, the importance sampling method and the Metrop...